
Conversation

@jan-elastic
Contributor

No description provided.

@jan-elastic jan-elastic added >non-issue :ml Machine learning Team:ML Meta label for the ML team v8.19.0 v9.1.0 labels Mar 6, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)


final IndexRequest indexRequest = getIndexRequestOrNull(item.request());
- var newDocMap = indexRequest.sourceAsMap();
+ var newDocMap = response.source();
Contributor Author

This reuses the parsed source from line 511 instead of parsing it again.

Parsing it again causes the full input string to be extracted and stored once more, creating another copy of a potentially large byte[]. The savings here are very similar to Remove matched text from chunks.
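
To spell out the duplication (a simplified sketch, not the exact code around line 511; variable names invented for illustration):

// Simplified illustration only: the source was already parsed earlier in this method
// (around line 511) into a Map whose values include the large inference input String.
Map<String, Object> alreadyParsed = response.source();

// Calling sourceAsMap() here walks the raw source bytes again and allocates fresh
// objects for every value, i.e. a second copy of that potentially large String.
Map<String, Object> parsedAgain = indexRequest.sourceAsMap();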

Contributor

I am not sure we can really call it a savings. With this change we'll keep all the parsed source in memory for the entire batch. That's what we wanted to avoid by re-parsing eagerly.
One possible optimisation here would be to parse and write the value without keeping the intermediate Map object.
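
To make that concrete, here is a rough sketch of the streaming idea, written against plain Jackson rather than the actual filter code; rewriteField, fieldToReplace and newValue are invented names, and the real code would use the XContent equivalents:

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonGenerator;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Streams the source from parser to generator, swapping in the new value for one field,
// so no intermediate Map of the whole document is ever materialised.
static byte[] rewriteField(byte[] source, String fieldToReplace, String newValue) throws IOException {
    JsonFactory factory = new JsonFactory();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (JsonParser parser = factory.createParser(source);
         JsonGenerator gen = factory.createGenerator(out)) {
        gen.writeStartObject();
        parser.nextToken(); // consume START_OBJECT
        while (parser.nextToken() != JsonToken.END_OBJECT) {
            String name = parser.currentName();
            parser.nextToken(); // move to the field's value
            if (name.equals(fieldToReplace)) {
                parser.skipChildren();            // discard the old value without building objects
                gen.writeStringField(name, newValue);
            } else {
                gen.writeFieldName(name);
                gen.copyCurrentStructure(parser); // stream the value through unchanged
            }
        }
        gen.writeEndObject();
    }
    return out.toByteArray();
}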

Contributor Author
@jan-elastic jan-elastic Mar 6, 2025

With this change there's one less copy of the (potentially long) input string around. In my experiments that's a clear win, because the source doesn't contain anything else.

It comes at the expense of keeping a parsed version of the rest of the source around. This costs some memory indeed, and becomes a memory loss if the rest of the source is larger than the input string. You might also argue that it's a computational win, because the parsed source is cached.

I'm not sure what you mean by:

One possible optimisation here would be to parse and write the value without keeping the intermediate Map object.

In the end, a parsed version of the source is needed for indexing. If you re-parse the source, you get an additional copy of the input string, which is what I'm trying to prevent.

If you prefer, I could keep the source around only if size(input) > size(source) / 2 or so, because in that case it's (most likely) a memory win, and re-parse it otherwise. WDYT?
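
In code, the heuristic would be roughly the following (totalInputLength and rawSourceLength are invented names for the combined size of the inference inputs and the size of the raw source, just to illustrate):

boolean keepParsedSource = totalInputLength > rawSourceLength / 2;

Map<String, Object> newDocMap = keepParsedSource
    ? response.source()           // reuse the cached parse: avoids a second copy of the input
    : indexRequest.sourceAsMap(); // re-parse as before and let the cached Map be collected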

Contributor

It's a batch API; we have more than one source at a time. This change would keep all of the parsed sources in memory for the entire duration of the batch inference calls. We're not trying to optimise for the case where the bulk API contains a single document. This is something we can improve, but this change alone doesn't save anything memory-wise. The parsed source is ephemeral and kept for a single document at a time.

Contributor Author

The input string is not ephemeral: it's referenced in the FieldInferenceRequest and the FieldInferenceResponse, and therefore arrives here.

Maybe we can instead remove the input string from the FieldInferenceResponse? That shouldn't be necessary, right?
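
For context, the shape of the issue looks roughly like this (field layouts invented for illustration, not the actual record definitions):

// If the response keeps a reference to the original input text, that (potentially large)
// String stays reachable for as long as the response does, i.e. for the whole batch.
record FieldInferenceRequest(String field, String input) {}
record FieldInferenceResponse(String field, String input, Object chunkedResults) {}
// Dropping input from FieldInferenceResponse would let the text become collectable once
// inference has run, assuming nothing downstream still needs the raw value.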

Contributor

What does the input string have to do with this change? Sorry, I'm not following.

Contributor
@jimczi jimczi Mar 6, 2025

Removing the input from the response is something we forgot to follow up on after moving to the new format:
#118893 (comment)
cc @Mikep86

Contributor

Sorry, I lost track of that one @jimczi; I'll create a task to address it soon. But isn't that change orthogonal to this PR? We still need the parsed source both when generating the inference request (to get the values to perform inference on) and when handling the inference response (to update the source).

Contributor Author

Yes, that change seems orthogonal indeed.

Right now, profiling shows that when you get to the line

var newDocMap = indexRequest.sourceAsMap();

there's already a parsed version of the (potentially long) input string in memory, and this line is generating an additional copy. This PR prevents that, but I agree it also introduces some other memory usage.

I'm investigating right now whether I can prevent it without that side effect. To be continued...

Contributor

I opened #124313 to reduce the overall memory needed to execute the bulk request. @Mikep86 I included the removal of the input from the response since it's a minimal change.

Contributor
@Mikep86 Mikep86 left a comment

LGTM, thanks for the optimization!

Contributor
@jimczi jimczi left a comment

I left one comment; this change would increase memory consumption, not reduce it.

@jan-elastic jan-elastic requested a review from jimczi March 6, 2025 15:07
@jan-elastic jan-elastic closed this Mar 7, 2025
@jan-elastic jan-elastic deleted the optimize-ShardBulkInferenceActionFilter branch June 3, 2025 07:59